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METHOD AND SYSTEM FOR LIMITING As another example of * query of a database, a user might 

THE CARDINALITY OF AN SQL QUERY want the RDBMS to retrieve a result set comprising images 

RESULT that resemble an input image, possibly ordered by 

resemblance, where resemblance is determined by a user- 
BACKGROUND OF THB INVENTION 5 defined function on the image data type. 

1 Field of the Invention ltxo &u*d by the present invention, while highly 

_ . effective as a database search language, it happens that SQL 

The present invention relates generally to database man- m&kes no provision for explicitly Um iti n g the size 

agement system query engines, and more particularly to CcardiDaIity) ^ of reBuIt are generated by an 

enhancing the efficiency of query processing m a database 10 ^ D BMS ^ r ^ to UBef ieR InBtead when it ^ 

management system by limitiag the query result by a dcsired to ^ m< , aiimb&r of ^ m ft ^ ^ ^ 

query-specified cardinality, application program must request the entire result set from 

2. Description of the Related Art RDBMS and then fetch only the desired number of tuples 

A relational database management system (RDBMS) is a from the result set. 

computer database management system that uses relational " Thus, in the first example SQL does net support a query 

techniques for storing and retrieving data, Relational data- that asks for only the ten images that most resemble the input 

bases are computerized information storage and retrieval image. In the second example there is no way to submit and 

systems in which data in the form of tables (formally, execute a request for only the ten highesUgrossing videos, 

"relations") are typically stored for use on disk drives or ordered by income, 

similar mass data stores. A "relation" includes a set of rows » Unfortunately, the above-described shortcoming means 

(formal y, "tuples" or "records") spanning several columns that, f or any ^ry, the RDBMS must be ready to generate 

(formally, "attributes"). A "tuple" expresses a mathematical an eDtire resuIt ^ even though only a subset of the query 

relation between its "attributes" (column elements), while a refiu i t j s desired, with the application program simply trim- 

"record" does not. Reference is made to C. J. Date, An ^ ^ ^salt set down to the desired size. Thus, far the 

Introduction to Dalahase Systems, 6th Ed., Addision- " fl rgtexaiD pi e a bov^ ©very fanigein the datobase that salis- 

Wesley, Reading, M. A. (1994) for general treatment of Ihe m a particular SQL-generated query plan, and not just the 

relational database art "closest* ten images, would likely be processed into the 

An RDBMS is structured to accept commands to store, result set. Likewise, in the second example, the entire set of 

retrieve and delete data using high-level query languages revenue-generating videos in the database would be pro- 

such as the structured query language SQL (SQL). The term 30 cessed into the result set 

"query* denominates a set of commands far retrieving jlata Means are available in SQL for implicity limiting result 

from a stored database. The SQL standard has been prdmul^ gc t f or example, by limit testing. However, the efficacy 

c gajed by:^ 0 f tbis technique depends upon making a good, informed 

Reference is made; for example, to the SQI^92 Standard ^ ss as to how a specified limit divides the set of all 

-"Database Language SQL" published by the .ANSI as ANSI 35 possible results. SQL also provides means to limit the size 

1X3.135-1992 and- published- by-tbe ISO as ISO/IEC of a subset of results mat are returned to a user by specifying 

79075:1992 for the official specification of the 1992 version the size of a set of result items transferred from the RDBMS 

of-the"Structured Query Language. _ , to the user from a result set generated by the RDBMS in 

As used herein, a "query" refers to a set of user commands 4Q response to the user's query. Neither of these SQL 

for retrieving data from a stored database. SQL is used to mechanisms, however, provides an explicit limit on the size 

communicate queries to an RDBMS. SQL requires the of a result set that may be declared in a query, 

return of a particular result set in response to a particular As a consequence of entirely processing a query even 

query, but the method of query execution ("Query Execution when only a limited response is necessary, the response time 

Plan") employed by the RDBMS is not specified by the 45 of the RDBMS to execute the query can be prolonged 

query. There are typically many different useful execution unnecessarily, computing time wasted, and processing 

plans for any particular query, each of which returns the capacity reduced for concurrently executing multiple que- 

required result set. For large databases, the execution plan ries. Furthermore, the continuing growth in size and usage of 

executed by the RDBMS to execute a query must provide relational databases compounds these problems. Moreover, 

the required data return at a reasonable cost and time and JQ by not incorporating a cardinality limitation in query results, 

hardware resource. Almost all RDBMSs include a query an RDBMS cannot exploit additional cardinality-related 

optimizer to translate queries into an efficiently executable information that otherwise would be available to it 
plan. Query compilation and optimization for SQL are 

disclosed in detail in US, Pat. Nos, 5,367,675, 5,546,576 SUMMARY OP THE INVENTION 

and 5,546,570, all assigned to the assignee of this applica- J5 ^ pfCseDl tonMcm recognizes the desirability of incor- 

Uon and incorporated in their entirety by this reference, and porating an explicit cardinah'ty-Iirniiing operator in a data- 

^ V ,'tF^ Ui a ??o^° D t L ^;^ 94 ' 532 ' f Icd base language that enables a user to limit the size of a query 

;LI 995 t Pl 0 T r ■ ' , ?' S'* 19 ' 6 ? 2 * B5ued » A P r - 8 ' result. In addition, the present invention understands that 

1997), which is also assigned to the assignee of this appli- modification to existing database management systems 

cation and incorporated m its entirety by this reference. 6Q uat UBe SQL pref&rab i y requiiefi a minmial change to m& 

Queries are submitted by users of the RDBMS. As under- database management systems themselves. In other words, it 
stood in the art, the term "user" has a manifold meaning that is desirable that the specification of a cardinality limit in 
encompasses, for example, a human operator, an application SQL require at most minimal or no changes to other opera- 
program, a remote machine, and so on, tions of a database management system, such as join and 

As one example, a database might contain business data, 65 merge, 

and a user might want to know the past week's gross rental Accordingly, it is an object of the present invention to 

income for rental videos, ordered by income. provide a method and apparatus for explicitly limiting the 



1/24/05, EAST Version: 2.0.1.4 



Ill Hill III 1 1 II I II II II II 1 1 II II 

US005956706A 

United States Patent [u] [u\ Patent Number: 5,956,706 

Carey et al. [45] Date of Patent: Sep. 21, 1999 



[54] METHOD AND SYSTEM FOR LIMITING 
THE CARDINALITY OF AN SQL QUERY 
RESULT 

[75] Id vectors: Michael James Carey, San Jose, Calif,; 

Donald AJan Kossmann, F assail, 
Germany 

[73] Assignee: International Business Machines 
Corporation, Armonk, N.Y 

[21] Appl. No.: 08/853,685 
[22] Filed: May 9, 1997 

[51] Int. CL* G06F 17/30 

[52] U.S. CI 707/2 

[58] Field of Search 707/1-10, HXM04, 

707/200-206 

[56] References Cited 

U.S. PATENT DOCUMENTS 



4,769,772 9/1988 Dwyer 707/2 

5,367,675 U/1994 Chen et al 395/600 

MR804 5/1995 Krishna 395/600 

5,412^06 5/199S Du et al 707/2 

5,530,939 6/1996 Mansfiled, Jr« et al 395/600 

5,546,570 a/1996 McPherson, Jr. el al 395/600 

S,548>7S4 8/1996 Krahesh et al 395/600 

5,548,758 a/1996 Ffranwli a al 395/600 



5J9SJS9 1/1997 Oiaudhuri 707/2 

5.794.228 8/1998 French et al, „ 707/2 

5.794.229 8/1998 French et al. 707/2 

Primary Examiner— Ruay Li an Ho 

Attorney, Agent, or Fim^Gtzy Cary Ware Freidenrich 

[57] ABSTRACT 

A STOP AFTER clause for a relational database manage- 
ment system (RDBMS) structured query language (SQL) 
causes the RDBMS, in response to a query, to return a query 
result having at most a predetermined cardinality specified 
in the query. A query with a STOP AFTER clause is 
processed by the RDBMS by provision of one or more STOP 
operators that are inserted into a query execution plan 
generated by the RDBMS to execute the query. In a con- 
servative policy, STOP operators are inserted in the query 
execution plan such that no tuples that might be required in 
the query result are discarded, In contrast, an aggressive 
policy inserts a STOP operator in the query execution plan 
wherever it is able to provide a beneficial cardinality reduc- 
tion. A RESTART operator is inserted into aggressive policy 
query execution plana to ensure that at least the predeter- 
mined number of tuples are returned in the query result, and 
a final STOP operator is added at or near the root of the plan 
to ensure that at most the specified number of tuples are 
produced. 

25 Claims, 4 Drawing Sheets 
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For each location 
In query plan where 
first column of 
ORDER BY clauti 
can be computed, 
DO 



Aggressive Policy 
Query Plan Generation 
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insert STOP 
operator with 
estimated NST0P 



stimato 
•topping 
cardinality 
N STOP 
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If Scan-Stop 1$ 
In pipelined section 

pull to top of 
pipeline; recompute 
"STOP 





i /-« 


Insert second 
STOP with exact 

cardinality*^ 
at or near root 

of subplan 




Insert RESTART 
at first location 
above STOP where 
all remaining 
predicates are non-reductive 
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